Abstract. The problems studied in this article originate from the
Graph Motif problem introduced by Lacroix et al. [17] in the context
of biological networks. The problem is to decide if a vertex-colored
graph has a connected subgraph whose colors equal a given multiset of
colors $M$. Using an algebraic framework recently introduced by Koutis
et al. [15,16], we obtain new FPT algorithms for Graph Motif and
variants, with improved running times. We also obtain results on the
counting versions of this problem, showing that the counting problem is
FPT if $M$ is a set, but becomes #W[1]-hard if $M$ is a multiset with two
colors.

1 Introduction 

An emerging field in modern biology is the study of biological networks,
which represent the interactions between biological elements [1]. A network is
modeled by a vertex-colored graph, where nodes represent the biological
compounds, edges represent their interactions, and colors represent
functionalities of the graph nodes. Networks are often analyzed by studying
their network motifs, which are defined as small recurring subnetworks. Motifs
generally correspond to a set of elements realizing the same function, and
which may have been evolutionarily preserved. Therefore, the discovery and the
querying of motifs is a crucial problem [20], since it can help to decompose
the network into functional modules, to identify conserved elements, and to
transfer biological knowledge across species.

The initial definition of network motifs involves conservation of the topology
and of the node labels; hence, looking for topological motifs is roughly
equivalent to subgraph isomorphism, and thus is a computationally difficult
problem. However, in some situations, the topology is not known or is
irrelevant, which leads to searching for functional motifs instead of
topological ones. In this setting, we still ask for the conservation of the
node labels, but we replace topology conservation by the weaker requirement
that the subnetwork should form a connected subgraph of the target graph. This
approach was advocated by [17] and led to the definition of the Graph Motif
problem [10]: given a vertex-colored graph $G = (V, E)$ and a multiset of
colors $M$, find a set $V' \subseteq V$ such that the induced subgraph $G[V']$
is connected, and the multiset of colors of the vertices of $V'$ is equal to
$M$. In the literature, a distinction is made between the colorful case (when
$M$ is a set), and the multiset case (when $M$ is an arbitrary multiset).
Although this problem has been introduced for biological motivations, [3]
points out that it may also be used in social or technical networks.

Not surprisingly, Graph Motif is NP-hard, even if $G$ is a bipartite graph
with maximum degree 4 and $M$ is built over two colors only [10]. The problem
is still NP-hard if $G$ is a tree, but in this case it can be solved in
$O(n^{2c+2})$ time, where $c$ is the number of distinct colors in $M$, while
being W[1]-hard for the parameter $c$ [10]. The difficulty of this problem is
counterbalanced by its fixed-parameter tractability when the parameter is $k$,
the size of the solution [17, 10, 3]. The currently fastest FPT algorithms for
the problem run in $O^*(2^k)$ time for the colorful case, $O^*(4.32^k)$ time
for the multiset case, and use exponential space (see footnote 3).

Our contribution is twofold. First, we consider in Section 3 the decision
versions of the Graph Motif problem, as well as some variants: we obtain
improved FPT algorithms for these problems, by using the algebraic framework
of multilinear detection for arithmetic circuits [15,16], presented in the
next section. Second, we investigate in Section 4 the counting versions of the
Graph Motif problem: instead of deciding if a motif appears in the graph, we
now want to count the occurrences of this motif. This makes it possible to
assess whether a motif is over- or under-represented in the network, by
comparing the actual count of the motif to its expected count under a null
hypothesis [19]. We show that the counting problem is FPT in the colorful
case, but becomes #W[1]-hard for the multiset case with two colors. We refer
the reader to [12,11] for definitions related to parameterized counting
classes.

2 Definitions 

This section contains definitions related to arithmetic circuits, and to the
Multilinear Detection (MLD) problem. It concludes by stating Theorem 1, which
will be used throughout the paper.

2.1 Arithmetic circuits 

In the following, a capital letter $X$ will denote a set of variables, and a
lowercase letter $x$ will denote a single variable. If $X$ is a set of
variables and $A$ is a commutative ring, we denote by $A[X]$ the ring of
multivariate polynomials with coefficients in $A$ and involving variables of
$X$. Given a monomial $m = x_1 \cdots x_k$ in $A[X]$, where the $x_i$'s are
variables, its degree is $k$, and $m$ is multilinear iff its variables are
distinct.

An arithmetic circuit over $X$ is a pair $\mathcal{C} = (C, r)$, where $C$ is
a labeled directed acyclic graph (dag) such that (i) the children of each node
are totally ordered, (ii) the nodes are labeled either by
$op \in \{+, \times\}$ or by an element of $X$, (iii) no internal node is
labeled by an element of $X$, and where $r$ is a distinguished node of $C$
called the root. We denote by $V_C$ the set of nodes of $C$, and for a given
node $u$ we denote by $N_C(u)$ the set of children (i.e. out-neighbors) of $u$
in $C$. We recall that a node $u$ is called a leaf of $C$ iff
$N_C(u) = \emptyset$, an internal node otherwise. We denote by
$T(\mathcal{C})$ the size of $\mathcal{C}$ (defined as the number of arcs),
and we denote by $S(\mathcal{C})$ the number of nodes of $\mathcal{C}$ of
indegree $\geq 2$.

3 We use the notations $O^*$ and $\tilde{O}$ to suppress polynomial and
polylogarithmic factors, respectively.

Given a commutative ring $A$, evaluating $\mathcal{C}$ over $A$ under a
mapping $\phi : X \to A$ consists in computing, for each node $u$ of $C$, a
value $val(u) \in A$ as follows: 1. for a leaf $u$ labeled by $x \in X$, we
let $val(u) = \phi(x)$, 2. for an internal node $u$ labeled by $+$ (resp.
$\times$), we compute $val(u)$ as the sum (resp. product) of the values of its
children. The result of the evaluation is then $val(r)$. The symbolic
evaluation of $\mathcal{C}$ is the polynomial
$P_{\mathcal{C}} \in \mathbb{Z}[X]$ obtained by evaluating $\mathcal{C}$ over
$\mathbb{Z}[X]$ under the identity mapping $\phi : X \to \mathbb{Z}[X]$.
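The evaluation rule above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the text: a node is either a string (a leaf labeled by a variable) or a pair `(op, children)` with `op` equal to `"+"` or `"*"`, and the function name `evaluate` is hypothetical. Shared nodes are recomputed rather than cached, so the sketch treats the dag as a tree.

```python
from math import prod

def evaluate(node, phi):
    """Return val(node) under the mapping phi: variable name -> ring element."""
    if isinstance(node, str):            # leaf labeled by a variable
        return phi[node]
    op, children = node                  # internal node labeled by + or x
    values = [evaluate(child, phi) for child in children]
    if op == "+":
        return sum(values)               # an empty sum evaluates to 0
    return prod(values)                  # an empty product evaluates to 1

# Circuit for (x + y) * x, whose symbolic evaluation is x^2 + xy.
x_plus_y = ("+", ["x", "y"])
circuit = ("*", [x_plus_y, "x"])
value = evaluate(circuit, {"x": 2, "y": 3})   # (2 + 3) * 2
```

Evaluating over the integers with a 0/1 mapping is the special case used later for counting.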

We stress that the above definition of arithmetic circuits does not allow
constants, a restriction which is necessary for the algorithms. However, we
can safely allow the two constants $0_A$ and $1_A$, the zero and the unit of
$A$ (which is assumed to be a unital ring). For simplicity, these two
constants will be represented by an empty sum and an empty product,
respectively.

2.2 Multilinear Detection 

Informally, the Multilinear Detection problem asks, for a given arithmetic
circuit $\mathcal{C}$ and an integer $k$, if the polynomial $P_{\mathcal{C}}$
has a multilinear monomial of degree $k$. However, this definition does not
give a certificate checkable in polynomial time, so for technical reasons we
define the problem differently.

A monomial-subtree of $\mathcal{C}$ is a pair $T = (\mathcal{C}', \phi)$,
where $\mathcal{C}' = (C', r')$ is an arithmetic circuit over $X$ whose
underlying dag $C'$ is a directed tree, and where $\phi : V_{C'} \to V_C$ is
such that (i) $\phi(r') = r$, (ii) if $u \in V_{C'}$ is labeled by $x \in X$,
then so is $\phi(u)$, (iii) if $u \in V_{C'}$ is labeled by $+$ then so is
$\phi(u)$, and $N_{C'}(u)$ consists of a single element $v$ with
$\phi(v) \in N_C(\phi(u))$, (iv) if $u \in V_{C'}$ is labeled by $\times$,
then so is $\phi(u)$, and $\phi$ maps bijectively $N_{C'}(u)$ into
$N_C(\phi(u))$ by preserving the ordering on siblings. The variables of $T$
are the leaves of $C'$ labeled by variables in $X$. We say that $T$ is
distinctly-labeled iff its variables are distinct.

Intuitively, a monomial-subtree tells us how to construct a monomial from
the circuit: Condition (i) tells us to start at the root, Condition (iii)
tells us that when reaching a $+$ node we are only allowed to pick one child,
and Condition (iv) tells us that when reaching a $\times$ node we have to pick
all children. The (distinctly-labeled) monomial-subtrees of $\mathcal{C}$ with
$k$ variables will then correspond to the (multilinear) monomials of
$P_{\mathcal{C}}$ having degree $k$. Therefore, we formulate the Multilinear
Detection problem as follows:

Name: Multilinear Detection (MLD) 

Input: An arithmetic circuit C over a set of variables X, an integer k 
Solution: A distinctly-labeled monomial-subtree of C with k variables. 
Solving MLD amounts to deciding if $P_{\mathcal{C}}$ has a multilinear
monomial of degree $k$ (observe that there are no possible cancellations), and
solving #MLD amounts to computing the sum of the coefficients of multilinear
monomials of $P_{\mathcal{C}}$ having degree $k$. The restriction of MLD when
$|X| = k$ is called Exact Multilinear Detection (XMLD). In this article, we
will rely on the following far-reaching result from [21, 16] to obtain new
algorithms for Graph Motif:

Theorem 1 ([21,16]). MLD can be solved by a randomized algorithm which
uses $\tilde{O}(2^k T(\mathcal{C}))$ time and $\tilde{O}(S(\mathcal{C}))$ space.
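To make the MLD and #MLD questions concrete, here is a brute-force sketch that expands the polynomial of a small circuit and inspects its monomials. It is not the algorithm behind Theorem 1: it takes exponential time and space in general and is only meant to show what is being asked. The circuit encoding (strings for variables, pairs `(op, children)` for internal nodes) and the names `expand`, `has_multilinear` and `count_multilinear` are assumptions made for this illustration.

```python
from collections import Counter

def expand(node):
    """Symbolic evaluation: map each monomial (sorted tuple of variables) to its coefficient."""
    if isinstance(node, str):
        return Counter({(node,): 1})
    op, children = node
    if op == "+":
        total = Counter()
        for child in children:
            total.update(expand(child))          # coefficients add up, no cancellation
        return total
    result = Counter({(): 1})                    # empty product is the constant 1
    for child in children:
        factor = expand(child)
        product = Counter()
        for m1, a in result.items():
            for m2, b in factor.items():
                product[tuple(sorted(m1 + m2))] += a * b
        result = product
    return result

def is_multilinear(monomial, k):
    return len(monomial) == k and len(set(monomial)) == k

def has_multilinear(node, k):
    """MLD: is there a multilinear monomial of degree k?"""
    return any(is_multilinear(m, k) for m in expand(node))

def count_multilinear(node, k):
    """#MLD: sum of the coefficients of multilinear monomials of degree k."""
    return sum(c for m, c in expand(node).items() if is_multilinear(m, k))

square = ("*", [("+", ["x", "y"]), ("+", ["x", "y"])])   # x^2 + 2xy + y^2
```

On `square`, the only multilinear monomial of degree 2 is `xy`, which appears with coefficient 2.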

3 Finding vertex-colored subtrees 

In this section, we consider several variants of the Graph Motif problem,
and we obtain improved FPT algorithms for these problems by reduction to
MLD. Notably, we obtain $O^*(2^k)$ time algorithms for problems involving
colorful motifs, and $O^*(4^k)$ time algorithms for multiset motifs.

3.1 The colorful case 

In the colorful formulation of the problem, the graph is vertex-colored, and we 
seek a subtree with k vertices having distinct colors. This leads to the following 
formal definition: 

Name: Colorful Graph Motif (CGM)

Input: A graph $G = (V, E)$, $k \in \mathbb{N}$, a set $C$, a function $\chi : V \to C$

Solution: A subtree $T = (V_T, E_T)$ of $G$ s.t. (i) $|V_T| = k$ and (ii) for
each $u, v \in V_T$ distinct, $\chi(u) \neq \chi(v)$.

The restriction of Colorful Graph Motif when $|C| = k$ is called Exact
Colorful Graph Motif (XCGM). Note that this restriction requires that the
vertices of $T$ are bijectively labeled by the colors of $C$. In [7], the XCGM
problem was shown to be solvable in $O^*(2^k)$ time and space, while it is not
difficult to see that the general CGM problem can be solved in $O^*((2e)^k)$
time and $O^*(2^k)$ space by color-coding. By using a reduction to Multilinear
Detection, we improve upon these complexities. In the following, we let $n$
and $m$ denote the number of vertices and the number of edges of $G$,
respectively.

Proposition 1. CGM is solvable by a randomized algorithm in
$\tilde{O}(2^k k^2 m)$ time and $\tilde{O}(kn)$ space.

Proof (Sketch). Let $I$ be an instance of CGM. We construct the following
circuit $\mathcal{C}_I$: its set of variables is $\{x_c : c \in C\}$, and we
introduce intermediary nodes $P_{i,u}$ for $1 \leq i \leq k$, $u \in V$, as
well as a root node $P$. Informally, the multilinear monomials of $P_{i,u}$
will correspond to colorful subtrees of $G$ having $i$ vertices, including
$u$. The definitions are as follows:

$$P_{i,u} = \sum_{i'=1}^{i-1} \sum_{v \in N_G(u)} P_{i',u} \, P_{i-i',v} \ \text{ if } i > 1, \qquad P_{1,u} = x_{\chi(u)}$$

and $P = \sum_{u \in V} P_{k,u}$. The resulting instance of MLD is
$I' = (\mathcal{C}_I, k)$. By applying Theorem 1, and by observing that
$T(\mathcal{C}_I) = O(k^2 m)$ and $S(\mathcal{C}_I) = O(kn)$, we solve $I'$ in
$\tilde{O}(2^k k^2 m)$ time and $\tilde{O}(kn)$ space. The correctness of the
construction follows by showing by induction on $1 \leq i \leq k$ that:
$x_{c_1} \cdots x_{c_d}$ is a multilinear monomial of $P_{i,u}$ iff (i)
$d = i$ and (ii) there exists $T = (V_T, E_T)$ colorful subtree of $G$ such
that $u \in V_T$ and $\chi(V_T) = \{c_1, \ldots, c_d\}$. □
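The recurrence for $P_{i,u}$ can be explored on small graphs by explicitly tracking the color sets of its multilinear monomials. The sketch below does exactly that, which means it uses exponential space and is not the algorithm of Theorem 1; it only illustrates the correspondence between multilinear monomials and colorful subtrees. The adjacency-dictionary input format and the names `colorful_monomials` and `has_colorful_subtree` are assumptions of this example.

```python
def colorful_monomials(adj, color, k):
    """P[i][u]: color sets S such that the product of x_c, c in S, is a
    multilinear monomial of P_{i,u} (non-multilinear monomials are dropped)."""
    P = {1: {u: {frozenset([color[u]])} for u in adj}}   # P_{1,u} = x_{chi(u)}
    for i in range(2, k + 1):
        P[i] = {u: set() for u in adj}
        for u in adj:
            for i1 in range(1, i):                       # i' in the recurrence
                for v in adj[u]:                         # v in N_G(u)
                    for s1 in P[i1][u]:
                        for s2 in P[i - i1][v]:
                            if s1.isdisjoint(s2):        # product stays multilinear
                                P[i][u].add(s1 | s2)
    return P

def has_colorful_subtree(adj, color, k):
    """Root node P = sum over u of P_{k,u}: does it have a multilinear monomial?"""
    P = colorful_monomials(adj, color, k)
    return any(P[k][u] for u in adj)

# Path a - b - c.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
```

With colors `{"a": 1, "b": 2, "c": 3}` the path itself is a colorful subtree on 3 vertices, whereas with colors `{"a": 1, "b": 2, "c": 1}` only colorful subtrees on at most 2 vertices exist.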

3.2 The multiset case 

We consider the multiset formulation of the problem: we now allow some colors 
to be repeated but impose a maximum number of occurrences for each color. 
This problem can be seen as a generalization of the original Graph Motif 
problem. 

Given a multiset $M$ over a set $A$, and given an element $x \in A$, we denote
by $n_M(x)$ the number of occurrences of $x$ in $M$. Given two multisets
$M, M'$, we denote their inclusion by $M \subseteq M'$. We denote by $|M|$ the
size of $M$, where elements are counted with their multiplicities. Given two
sets $A, B$, a function $f : A \to B$ and a multiset $X$ over $A$, we let
$f(X)$ denote the multiset containing the elements $f(x)$ for $x \in X$,
counted with multiplicities; precisely, given $y \in B$ we have
$n_{f(X)}(y) = \sum_{x \in A : f(x) = y} n_X(x)$.
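As a small illustration of this notation, multisets can be modeled with Python's `collections.Counter`, where the count of an element plays the role of its number of occurrences. The helper names `image` and `included` are hypothetical and only serve this example.

```python
from collections import Counter

def image(f, X):
    """Multiset f(X): each value f(x) is counted with the multiplicity of x in X."""
    result = Counter()
    for x, multiplicity in X.items():
        result[f(x)] += multiplicity
    return result

def included(M1, M2):
    """Multiset inclusion: every element occurs in M2 at least as often as in M1."""
    return all(M2[x] >= count for x, count in M1.items())

# Three vertices u, v, w (w taken twice) and a coloring chi.
chi = {"u": "red", "v": "red", "w": "blue"}
X = Counter({"u": 1, "v": 1, "w": 2})
colors = image(chi.get, X)
```

Here `colors` contains `red` twice and `blue` twice, so it is included in a motif only if the motif has at least two occurrences of each.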

We now define the following two variants of Colorful Graph Motif, 
which allow for multiset motifs: 

Name: Multiset Graph Motif (MGM)

Input: A graph $G = (V,E)$, an integer $k$, a set $C$, a function
$\chi : V \to C$, a multiset $M$ over $C$.

Solution: A subtree $T = (V_T, E_T)$ of $G$ s.t. (i) $|V_T| = k$ and (ii)
$\chi(V_T) \subseteq M$.

Name: Multiset Graph Motif With Gaps (MGMG)

Input: A graph $G = (V,E)$, integers $k, r$, a set $C$, a function
$\chi : V \to C$, a multiset $M$ over $C$.

Solution: A subtree $T = (V_T, E_T)$ of $G$ s.t. (i) $|V_T| \leq r$ and (ii)
there exists $S \subseteq V_T$ of size $k$ such that $\chi(S) \subseteq M$.

The restriction of Multiset Graph Motif when $|M| = k$ is called Exact
Multiset Graph Motif (XMGM). Note that in this case we require that $T$
contains every occurrence of $M$, i.e. $\chi(V_T) = M$. In this way, the XMGM
problem coincides with the Graph Motif problem defined in [10,3], while the
MGM problem is the parameterized version of the Max Motif problem considered
in [9]. The notion of gaps is introduced in [17], and encompasses the notion
of insertions and deletions of [7].
Previous algorithms for these problems relied on color-coding [2]; these
algorithms usually have an exponential space complexity, and a high time
complexity. For the Graph Motif problem, [10] gives a randomized algorithm
with an implicit $O(87^k k m)$ running time, while [3] describes a first
randomized algorithm running in $O(8.16^k m)$, and shows a second algorithm
with $\tilde{O}(4.32^k k^2 m)$ running time, using two different speed-up
techniques ([4] and [13]). For the Max Motif problem, [9] presents a
randomized algorithm with an implicit $O((32e^2)^k k m)$ running time. Here
again, we can apply Theorem 1 to improve the time and space complexities:

Proposition 2. 1. MGM is solvable by a randomized algorithm in
$\tilde{O}(4^k k^2 m)$ time and $\tilde{O}(kn)$ space.
2. MGMG is solvable by a randomized algorithm in $\tilde{O}(4^k r^2 m)$ time
and $\tilde{O}(rn)$ space.

Proof. Point 1. We modify the circuit of Proposition 1 as follows. For each
color $c \in C$ with $n_M(c) = m_c$, we introduce variables
$y_{c,1}, \ldots, y_{c,m_c}$, and we introduce a node
$Q_c = y_{c,1} + \ldots + y_{c,m_c}$. For each vertex $u \in V$, we introduce
a variable $x_u$, and we define:

$$P_{i,u} = \sum_{i'=1}^{i-1} \sum_{v \in N_G(u)} P_{i',u} \, P_{i-i',v} \ \text{ if } i > 1, \qquad P_{1,u} = x_u Q_{\chi(u)}$$

and $P = \sum_{u \in V} P_{k,u}$. Note that we changed only the base case in
the recurrence of Proposition 1. The intuition is that the variables $x_u$
will ensure that we choose different vertices to construct the tree, and that
the variables $y_{c,j}$ will ensure that a given color cannot occur more than
required. The resulting instance of MLD is $I' = (\mathcal{C}_I, 2k)$, and
since $T(\mathcal{C}_I) = O(k^2 m)$ and $S(\mathcal{C}_I) = O(kn)$, we solve
it in the claimed bounds by Theorem 1. A similar induction as in Proposition 1
shows that: for every $1 \leq i \leq k$, a multilinear monomial of $P_{i,u}$
has the form $x_{v_1} y_{c_1,j_1} \cdots x_{v_i} y_{c_i,j_i}$, and it is
present iff there is a subtree $(V_T, E_T)$ of $G$ such that $u \in V_T$,
$V_T = \{v_1, \ldots, v_i\}$ and
$\chi(V_T) = \{\{c_1, \ldots, c_i\}\} \subseteq M$.

Point 2. We modify the construction of Point 1 by now setting
$P_{1,u} = 1 + x_u Q_{\chi(u)}$ for each $u \in V$, and
$P = \sum_{u \in V} \sum_{i=1}^{r} P_{i,u}$. Informally, adding the constant 1
to each $P_{1,u}$ makes it possible to ignore some vertices of the subtree, so
that only a set $S$ of $k$ vertices with $\chi(S) \subseteq M$ is selected.
The correctness of the construction is shown by a similar induction as above.
The catch here is that when considering two trees $T_1, T_2$ obtained from
$P_{i',u}, P_{i-i',v}$, their selected vertices will be distinct, but they may
have "ignored" vertices in common; we can then find a subset of
$E(T_1) \cup E(T_2) \cup \{uv\}$ which forms a tree containing all selected
vertices from $T_1, T_2$. □

3.3 Edge- weighted versions 

We consider an edge-weighted variant of the problem, where the subtree is now
required to have a given total weight, in addition to respecting the color
constraints. This variant has been studied in [6] under the name Edge-Weighted
Graph Motif. In our case, we define two problems, depending on whether we
consider colorful or multiset motifs.

Name: Weighted Colorful Graph Motif (WCGM)

Input: A complete graph $G = (V, E)$, a function $\chi : V \to C$, a weight
function $w : E \to \mathbb{N}$, integers $k, r$

Solution: A subtree $T = (V_T, E_T)$ of $G$ such that (i) $|V_T| = k$, (ii)
$\chi$ is injective on $V_T$, (iii) $\sum_{e \in E_T} w(e) \leq r$.

Name: Weighted Multiset Graph Motif (WMGM)

Input: A complete graph $G = (V,E)$, a function $\chi : V \to C$, a weight
function $w : E \to \mathbb{N}$, integers $k, r$, a multiset $M$

Solution: A subtree $T = (V_T, E_T)$ of $G$ such that (i) $|V_T| = k$, (ii)
$\chi(V_T) \subseteq M$, (iii) $\sum_{e \in E_T} w(e) \leq r$.

We observe that the WMGM problem contains as a special case the Min-CC
problem introduced in [8], which seeks a subgraph respecting the multiset
motif, and having at most $r$ connected components. Indeed, we can easily
reduce Min-CC to WMGM: given the graph $G$, we construct a complete graph $G'$
with the same vertex set, and we assign a weight 0 to edges of $G$, and a
weight 1 to non-edges of $G$.
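The weighting used in this reduction is easy to write down. The following sketch builds the weight function of the complete graph $G'$, assuming weight 0 on the edges of $G$ and weight 1 on its non-edges; the function name `complete_weighted_graph` and the representation of edges as `frozenset` pairs are choices made for this example only.

```python
from itertools import combinations

def complete_weighted_graph(vertices, edges):
    """Weights of the complete graph G' on the same vertex set:
    0 for pairs that are edges of G, 1 for pairs that are not."""
    edge_set = {frozenset(e) for e in edges}
    return {
        frozenset(pair): (0 if frozenset(pair) in edge_set else 1)
        for pair in combinations(vertices, 2)
    }

# G has vertices 1, 2, 3 and the single edge {1, 2}.
weights = complete_weighted_graph([1, 2, 3], [(1, 2)])
```

A subtree of $G'$ then pays one unit of weight for each non-edge of $G$ it uses, that is, each time it bridges two parts that are not directly connected in $G$.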

Proposition 3. 1. WCGM is solvable by a randomized algorithm in
$\tilde{O}(2^k k^2 r^2 m)$ time and $\tilde{O}(krn)$ space.
2. WMGM is solvable by a randomized algorithm in $\tilde{O}(4^k k^2 r^2 m)$
time and $\tilde{O}(krn)$ space.

Proof. We only prove 1, since 2 relies on the same modification as in
Proposition 2. The construction of the arithmetic circuit is similar to the
construction in Proposition 1. The set of variables is $\{x_c : c \in C\}$,
and we introduce nodes $P_{i,j,u}$, for $1 \leq i \leq k$ and
$0 \leq j \leq r$, whose multilinear monomials will correspond to colorful
subtrees having $i$ vertices including $u$, and with total weight $\leq j$.
The definitions are as follows:

$$P_{i,j,u} = \sum_{i'=1}^{i-1} \sum_{v \in V} \sum_{j'=0}^{j-w(uv)} P_{i',j',u} \, P_{i-i',\,j-j'-w(uv),\,v} \ \text{ if } i > 1$$

and $P = \sum_{u \in V} P_{k,r,u}$. The resulting instance of MLD is
$I' = (\mathcal{C}_I, k)$, and since $T(\mathcal{C}_I) = O(k^2 r^2 m)$ and
$S(\mathcal{C}_I) = O(krn)$, we solve it in the claimed bounds by Theorem 1.
The correctness of the construction follows by showing that: given
$1 \leq i \leq k$, $0 \leq j \leq r$, $u \in V$, $x_{c_1} \cdots x_{c_d}$ is a
multilinear monomial of $P_{i,j,u}$ iff (i) $d = i$ and (ii) there exists
$T = (V_T, E_T)$ colorful subtree of $G$ with $u \in V_T$,
$\chi(V_T) = \{c_1, \ldots, c_d\}$ and $\sum_{e \in E_T} w(e) \leq j$. □
4 Counting vertex-colored subtrees

In this section, we consider the counting versions of the problems XCGM and
XMGM introduced in Section 3. For the former, we show that its counting
version #XCGM is FPT; for the latter, we prove that its counting version
#XMGM is #W[1]-hard.

4.1 FPT algorithms for the colorful case 

We show that #XCGM is fixed-parameter tractable (Proposition 5). We rely 
on a general result for #XMLD (Proposition 4), which uses inclusion-exclusion 
as in [14]. 

Say that a circuit $\mathcal{C}$ is $k$-bounded iff $P_{\mathcal{C}}$ has only
monomials of degree $\leq k$. Observe that given a circuit $\mathcal{C}$, we
can efficiently transform it into a $k$-bounded circuit $\mathcal{C}'$ such
that (i) $\mathcal{C}$ and $\mathcal{C}'$ have the same monomials of degree
$k$, (ii) $|\mathcal{C}'| \leq (k+1)^2 |\mathcal{C}|$; the details of the
construction are omitted (see footnote 4). The following result shows that we
can efficiently count solutions for $k$-bounded circuits with $k$ variables
(and thus for general circuits, with an extra $O(k^2)$ factor in the
complexity).

Proposition 4. #XMLD for $k$-bounded circuits is solvable in
$\tilde{O}(2^k T(\mathcal{C}))$ time and $\tilde{O}(S(\mathcal{C}))$ space.

Proof. Let $\mathcal{C}$ be the input circuit on a set $X$ of $k$ variables.
For a monomial $m$ let $Var(m)$ denote its set of variables. Given
$S \subseteq X$, let $N_S$, resp. $N'_S$, be the number of monomials $m$ of
$P_{\mathcal{C}}$ such that $Var(m) = S$, resp. $Var(m) \subseteq S$. Observe
that for every $S \subseteq X$, we have $N'_S = \sum_{T \subseteq S} N_T$.
Therefore, by Möbius inversion it holds that for every $S \subseteq X$,
$N_S = \sum_{T \subseteq S} (-1)^{|S \setminus T|} N'_T$.

Since $\mathcal{C}$ is $k$-bounded, $N_X$ is the number of multilinear
monomials of $P_{\mathcal{C}}$ having degree $k$. Now, each value $N'_S$ can
be computed by evaluating $\mathcal{C}$ under the mapping
$\phi : X \to \mathbb{Z}$ defined by $\phi(v) = 1$ if $v \in S$,
$\phi(v) = 0$ if $v \notin S$. By the Möbius inversion formula, we can thus
compute the desired value $N_X$ in $\tilde{O}(2^k T(\mathcal{C}))$ time and
$\tilde{O}(S(\mathcal{C}))$ space. □
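The inclusion-exclusion argument of this proof translates directly into code. The sketch below evaluates the circuit once per subset $S \subseteq X$ under the 0/1 mapping and combines the values with signs $(-1)^{|X \setminus S|}$. It assumes the circuit is $k$-bounded with $k = |X|$ and uses a hypothetical encoding (strings for variables, pairs `(op, children)` for internal nodes); shared nodes are recomputed, so it does not match the stated bounds on dags with heavy sharing.

```python
from itertools import combinations
from math import prod

def evaluate(node, phi):
    """Evaluate the circuit over the integers under the mapping phi."""
    if isinstance(node, str):
        return phi[node]
    op, children = node
    values = [evaluate(child, phi) for child in children]
    return sum(values) if op == "+" else prod(values)

def count_multilinear(node, variables):
    """N_X = sum over S of (-1)^{|X \\ S|} N'_S, for a k-bounded circuit, k = |X|."""
    k = len(variables)
    total = 0
    for size in range(k + 1):
        for subset in combinations(variables, size):
            chosen = set(subset)
            phi = {x: (1 if x in chosen else 0) for x in variables}
            n_prime = evaluate(node, phi)            # N'_S
            total += (-1) ** (k - size) * n_prime
    return total

square = ("*", [("+", ["x", "y"]), ("+", ["x", "y"])])   # x^2 + 2xy + y^2
```

For `square` with variables `x, y`, the four evaluations are 0, 1, 1 and 4, and the signed sum is 0 - 1 - 1 + 4 = 2, the coefficient of `xy`.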

It is worth mentioning that Proposition 4 generalizes several counting
algorithms based on inclusion-exclusion, such as the well-known algorithm for
#Hamiltonian Path of [14], as well as results of [18]. Indeed, the problems
considered in these articles can be reduced to counting multilinear monomials
of degree $n$ for circuits with $n$ variables (where $n$ is usually the number
of vertices of the graph), which leads to algorithms running in $O^*(2^n)$
time and polynomial space.

Let us now turn to applying Proposition 4 to the #XCGM problem. Recall
that we defined in Proposition 1 a circuit $\mathcal{C}_I$ for the general CGM
problem; we will have to modify it slightly for the purpose of counting
solutions.

4 The idea is to assume w.l.o.g. that $\mathcal{C}$ has outdegree 2. Then, we
create $k + 1$ copies $u_0, \ldots, u_k$ of each node $u$ of $\mathcal{C}$,
such that the monomials of $u_j$ correspond to the degree-$j$ monomials of
$u$. If $r$ is the root node of $\mathcal{C}$, then $r_k$ becomes the root
node of $\mathcal{C}'$.
Proposition 5. #XCGM is solvable in $\tilde{O}(2^k k^3 m)$ time and $\tilde{O}(k^2 n)$ space.

Proof. Let $I$ be an instance of XCGM. A rooted solution for $I$ is a pair
$(u, T)$ where $T$ is a solution of XCGM on $I$ and $u$ is a vertex of $T$
(which should be seen as the root of the tree). The solutions of XCGM on $I$
are also called unrooted solutions. Let $N_r(I)$ and $N_u(I)$ be the number of
rooted, resp. unrooted, solutions for $I$. We will show how to compute
$N_r(I)$ in the claimed time and space bounds; since $N_u(I) = N_r(I)/k$, the
result will follow.

To compute $N_r$, observe first that we cannot apply Proposition 4 to the
circuit $\mathcal{C}_I$ of Proposition 1. Indeed, the circuit $\mathcal{C}_I$
counts the ordered subtrees, and not the unordered ones. Therefore, we need to
modify the circuit in the following way: at each vertex $v$ of $V_T$, we
examine its children by increasing color. This leads us to define the
following circuit $\mathcal{C}'_I$: suppose w.l.o.g. that
$C = \{1, \ldots, k\}$, introduce nodes $P_{i,j,u}$ for each
$1 \leq i \leq k$, $1 \leq j \leq k+1$, $u \in V$, variables $x_i$ for each
$1 \leq i \leq k$, and define:

$$P_{1,j,u} = x_{\chi(u)}, \qquad P_{i,j,u} = 0 \ \text{ if } i \geq 2,\ j = k+1$$

$$P_{i,j,u} = P_{i,j+1,u} + \sum_{i'=1}^{i-1} \ \sum_{v \in N_G(u) : \chi(v) = j} P_{i',j+1,u} \, P_{i-i',1,v} \ \text{ if } i \geq 2,\ 1 \leq j \leq k$$

Let us also introduce a root node $P = \sum_{u \in V} P_{k,1,u}$. Given
$1 \leq i, j \leq k$ and $u \in V$, let $S_{i,j,u}$ denote the set of pairs
$(u, T)$ where (i) $T$ is a properly colored subtree of $G$ containing $u$ and
having $i$ vertices, (ii) the neighbors of $u$ in $T$ have colors $\geq j$. It
can be shown by induction on $i$ that: there is a bijection between
$S_{i,j,u}$ and the multilinear monomials of $P_{i,j,u}$. Therefore, the
number of multilinear monomials of $P$ is equal to $N_r$; since
$T(\mathcal{C}'_I) = O(k^3 m)$, $S(\mathcal{C}'_I) = O(k^2 n)$ and since
$\mathcal{C}'_I$ is $k$-bounded, it follows by Proposition 4 that $N_r$ can be
computed in $\tilde{O}(2^k k^3 m)$ time and $\tilde{O}(k^2 n)$ space. □
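The ordered recurrence for $P_{i,j,u}$ can be checked on small instances by expanding it explicitly. The sketch below keeps, for each node, the multilinear monomials (as color sets) together with their coefficients, sums the coefficients at the root to get the number of rooted solutions, and divides by $k$. This explicit expansion uses exponential space and replaces the inclusion-exclusion of Proposition 4 by direct bookkeeping; the function name and the input format (adjacency dictionary, colors in $\{1, \ldots, k\}$) are assumptions of this example.

```python
from collections import Counter
from functools import lru_cache

def count_colorful_subtrees(adj, color, k):
    """Number of unrooted solutions: subtrees on k vertices using each color 1..k once."""
    @lru_cache(maxsize=None)
    def P(i, j, u):
        # Returns the multilinear monomials of P_{i,j,u} as (color set, coefficient) pairs.
        if i == 1:
            return ((frozenset([color[u]]), 1),)     # P_{1,j,u} = x_{chi(u)}
        if j == k + 1:
            return ()                                # P_{i,k+1,u} = 0 for i >= 2
        acc = Counter(dict(P(i, j + 1, u)))          # no child of color j
        for i1 in range(1, i):
            for v in adj[u]:
                if color[v] != j:
                    continue                         # children are examined by color
                for s1, a in P(i1, j + 1, u):
                    for s2, b in P(i - i1, 1, v):
                        if s1.isdisjoint(s2):        # keep multilinear monomials only
                            acc[s1 | s2] += a * b
        return tuple(acc.items())

    rooted = sum(c for u in adj for _, c in P(k, 1, u))   # N_r
    return rooted // k                                    # N_u = N_r / k

triangle = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
colors = {"a": 1, "b": 2, "c": 3}
```

On the triangle with three distinct colors, each of the three spanning paths can be rooted at any of its three vertices, which gives 9 rooted and 3 unrooted solutions.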

4.2 Hardness of the multiset case 

In this subsection, we show that #XMGM is #W[1]-hard. For convenience, we
first restate the problem in terms of vertex-distinct embedded subtrees.

Let $G = (V,E)$ and $H = (V',E')$ be two multigraphs. A homomorphism of $G$
into $H$ is a pair $\phi = (\phi_V, \phi_E)$ where $\phi_V : V \to V'$ and
$\phi_E : E \to E'$, such that if $e \in E$ has endpoints $x, y$ then
$\phi_E(e)$ has endpoints $\phi_V(x), \phi_V(y)$. An embedded subtree of $G$
is denoted by $\mathcal{T} = (T, \phi_V, \phi_E)$ where $T = (V_T, E_T)$ is a
tree, and $(\phi_V, \phi_E)$ is a homomorphism from $T$ into $G$. We say that
$\mathcal{T}$ is a vertex-distinct embedded subtree of $G$ (a "vdst" of $G$)
iff $\phi_V$ is injective. We say that $\mathcal{T}$ is an edge-distinct
embedded subtree of $G$ (an "edst" of $G$) iff $\phi_E$ is injective. We
restate XMGM as follows:

Name: Exact Multiset Graph Motif (XMGM)

Input: A graph $G = (V,E)$, an integer $k$, a set $C$, a function
$\chi : V \to C$, a multiset $M$ over $C$ s.t. $|M| = k$.

Solution: A vdst $(T, \phi_V, \phi_E)$ of $G$ s.t. $\chi \circ \phi_V(V_T) = M$.



We first show the hardness of two intermediate problems (Lemma 1). Before
defining these problems, we need the following notions. Consider a multigraph
$G = (V,E)$. Consider a partition $\mathcal{P}$ of $V$ into
$V_1, \ldots, V_k$, and a tuple $t \in [r]^k$. A $(\mathcal{P},t)$-mapping
from a set $A$ is an injection $\psi : A \to V \times [r]$ such that for every
$x \in A$, if $\psi(x) = (u, i)$ with $u \in V_j$, then $1 \leq i \leq t_j$.
From $\psi$ we define its reduction as the function $\psi^r : A \to V$ defined
by $\psi^r(x) = v$ whenever $\psi(x) = (v, i)$. We also define a tuple
$T(\psi) = (n_1, \ldots, n_k) \in [r]^k$ such that for each $i \in [k]$,
$n_i = \max_{v \in V_i} |\{x \in A : \psi^r(x) = v\}|$.

Given two tuples $t, t' \in [r]^k$, denote $t \leq t'$ iff $t_i \leq t'_i$ for
each $i \in [k]$. Note that for a $(\mathcal{P},t)$-mapping $\psi$, we always
have $T(\psi) \leq t$ since $\psi$ is injective. We say that a
$(\mathcal{P},t)$-labeled edst for $G$ is a tuple $(T, \psi_V, \psi_E)$ where
(i) $T = (V_T, E_T)$ is a tree, (ii) $\psi_V$ is a $(\mathcal{P},t)$-mapping
from $V_T$, (iii) $(T, \psi_V^r, \psi_E)$ is an edst of $G$. Our intermediate
problems are defined as follows:

Name: Multicolored Embedded Subtree-1 (MEST-1)

Input: Integers $k, r$, a $k$-partite multigraph $G$ with partition
$\mathcal{P}$, a tuple $t \in [r]^k$

Solution: A $(\mathcal{P},t)$-labeled edst $(T, \psi_V, \psi_E)$ for $G$ s.t.
$|V_T| = r$ and $T(\psi_V) = t$.

The MEST-2 problem is defined similarly, except that we do not require
that $T(\psi_V) = t$ (and thus we only have $T(\psi_V) \leq t$). While we will
only need #MEST-2 in our reduction for #XMGM, we first show the hardness of
#MEST-1, then reduce it to #MEST-2.

Lemma 1. #MEST-1 and #MEST-2 are #W[1]-hard for parameter $(k,r)$.

The proof is omitted due to space constraints. 

Proposition 6. #XMGM is #W[1]-hard for parameter $k$.

Proof. We reduce from #MEST-2, and conclude using Lemma 1. Let
$I = (k,r,G,t)$ be an instance of #MEST-2, where $G = (V,E)$ is a multigraph,
and let $S_I$ be its set of solutions. From $G$, we construct a graph $H$ as
follows: (i) we subdivide each edge $e \in E$, creating a new vertex $a[e]$,
(ii) we substitute each vertex $v \in V_i$ by an independent set formed by
$t_i$ vertices $b[v,1], \ldots, b[v,t_i]$. We let $A$ be the set of vertices
$a[e]$ and $B$ the set of vertices $b[v,i]$; we therefore have a bipartite
graph $H = (A \cup B, F)$. We let $I' = (H, 2r-1, C, \chi, M)$, where
$C = \{1, 2\}$, $\chi$ maps $A$ to 1 and $B$ to 2, and $M$ consists of $r-1$
occurrences of 1 and $r$ occurrences of 2.

Then $I'$ is our resulting instance of #XMGM, and we let $S_{I'}$ be its set
of solutions. Notice that by definition of $\chi$ and $M$, $S_{I'}$ is the set
of vdst $(T, \phi_V, \phi_E)$ of $H$ containing $r-1$ vertices mapped to $A$
and $r$ vertices mapped to $B$. We now show that we have a parsimonious
reduction, by describing a bijection $\Phi : S_I \to S_{I'}$. Consider
$\mathcal{T} = (T, \psi_V, \psi_E)$ in $S_I$; we define
$\Phi(\mathcal{T}) = (T', \phi_V, \phi_E)$ as follows:
- For each edge $e = uv \in E(T)$, we have $f_e := \psi_E(e) \in E(G)$: we
then subdivide $e$, creating a new vertex $x_e$. Let $T'$ be the resulting
tree;

- For each vertex $x_e$, we define $\phi_V(x_e) = a[f_e]$. For each other
vertex $u$ of $T'$, we have $u \in V(T)$, let $(v, i) = \psi_V(u)$; we then
set $\phi_V(u) = b[v,i]$ (this is possible since if $v \in V_j$ then
$1 \leq i \leq t_j$, by definition of $\psi_V$).

From $\phi_V$ we then define $\phi_E$ in a natural way. Then
$\mathcal{T}' = \Phi(\mathcal{T})$ is indeed in $S_{I'}$: (i) $\mathcal{T}'$
is a vertex-distinct subtree of $H$ (by definition of $\phi_V$ and since
$\mathcal{T}$ was edge-distinct, the values $\phi_V(x_e)$ are distinct; by
injectivity of $\psi_V$, the other values $\phi_V(u)$ are distinct); (ii) it
has $r-1$ vertices mapped to $A$ and $r$ vertices mapped to $B$. To prove that
$\Phi$ is a bijection, we describe the inverse correspondence
$\Psi : S_{I'} \to S_I$. Consider $\mathcal{T}' = (T', \phi_V, \phi_E)$ in
$S_{I'}$; we define $\Psi(\mathcal{T}') = (T, \psi_V, \psi_E)$ as follows. Let
$A', B'$ be the vertices of $T'$ mapped to $A, B$ respectively. Let $i$ be the
number of nodes of $A'$ which are leaves: since the nodes of $A'$ have degree
1 or 2 in $T'$ depending on whether they are leaves or internal nodes, we then
have $|E(T')| \leq i + 2(r-1-i) = 2r - i - 2$; since $|E(T')| = 2r - 2$, we
must have $i = 0$. It follows that all leaves of $T'$ belong to $B'$; from
$T'$, by contracting each vertex of $A'$ in $T'$ we obtain a tree $T$ with $r$
vertices. We then define $\psi_V, \psi_E$ as follows: (i) given $u \in B'$, if
$\phi_V(u) = b[v,j]$, then $\psi_V(u) = (v,j)$; (ii) given
$e = uv \in E(T)$, there correspond two edges $ux, vx \in E(T')$ with
$x \in A'$, and we thus have $\phi_V(x) = a[f]$, from which we define
$\psi_E(e) = f$. It is easily seen that the resulting
$\mathcal{T} = \Psi(\mathcal{T}')$ is in $S_I$, and that the operations $\Phi$
and $\Psi$ are inverse of each other. □
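The construction of $H$ in this proof can be sketched as follows. The code assumes that the vertex $a[e]$ obtained by subdividing an edge $e = uv$ is made adjacent to every copy $b[u,i]$ and $b[v,i]$ of its endpoints, which is how substitution by an independent set is read here; the function name, the tuple encoding of vertices and the input format are illustrative choices, not part of the original text.

```python
from collections import Counter

def build_motif_instance(edges, part, t, r):
    """Build the bipartite graph H, its coloring and the motif M.

    edges: list of pairs (u, v) of the multigraph G (parallel edges allowed)
    part:  dict mapping each vertex of G to the index i of its class V_i
    t:     dict mapping each class index i to t_i
    """
    H, color = {}, {}
    for v, cls in part.items():
        for i in range(1, t[cls] + 1):
            H[("b", v, i)] = set()               # copies b[v,1], ..., b[v,t_i]
            color[("b", v, i)] = 2
    for index, (u, v) in enumerate(edges):
        a = ("a", index)                         # vertex a[e] subdividing edge e
        H[a] = set()
        color[a] = 1
        for w in (u, v):
            for i in range(1, t[part[w]] + 1):
                H[a].add(("b", w, i))
                H[("b", w, i)].add(a)
    motif = Counter({1: r - 1, 2: r})            # r-1 occurrences of 1, r of 2
    return H, color, motif

# Two parallel edges between x (class 1) and y (class 2), with t = (2, 1).
H, color, motif = build_motif_instance(
    [("x", "y"), ("x", "y")], {"x": 1, "y": 2}, {1: 2, 2: 1}, r=2
)
```

In this small example $H$ has two vertices of color 1 and three of color 2, and every edge of $H$ joins an $a$-vertex to a $b$-vertex.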



5 Conclusion 



In this paper, we have obtained improved FPT algorithms for several variants of 
the Graph Motif problem. Reducing to the Multilinear Detection prob- 
lem resulted in faster running times and a polynomial space complexity. We 
have also considered the counting versions of these problems, for the first time 
in the literature. Our results demonstrate that the algebraic framework of [16] 
has potential applications to computational biology, though a practical evalua- 
tion of the algorithms remains to be done. In particular, how do they compare 
to implementations based on color-coding or ILPs [7, 5]? 

We conclude with some open questions. A first question concerns our results
of Section 3.2 for multiset motifs: is it possible to further reduce the
$O^*(4^k)$ running times? Another question relates to the edge-weighted
problems considered in Section 3.3: our algorithms are only pseudopolynomial
in the maximum weight $r$; can this dependence on $r$ be improved? Finally, is
approximate counting possible for the #XMGM problem? We believe that some of
these questions may be solved through an extension of the algebraic framework
of Koutis and Williams.
6 Appendix



6.1 End of proof of Proposition 1 

Given a set $S \subseteq C$, define the multilinear monomial
$\pi_S := \prod_{c \in S} x_c$. Given $u \in V$ and $S \subseteq C$, a
$(u, S)$-solution is a subtree $T = (V_T, E_T)$ of $G$, such that
$u \in V_T$, $T$ is distinctly colored by $\chi$, and $\chi(V_T) = S$. We show
by induction on $1 \leq i \leq k$ that: $\pi_S$ is a multilinear monomial of
$P_{i,u}$ iff (i) $|S| = i$ and (ii) there exists a $(u, S)$-solution. This is
clear when $i = 1$; now, suppose that $i \geq 2$, and assume that the property
holds for every $1 \leq j < i$.

Suppose that $|S| = i$ and that $T = (V_T, E_T)$ is a $(u, S)$-solution; let
us show that $\pi_S$ is a multilinear monomial of $P_{i,u}$. Let $v$ be a
neighbor of $u$ in $T$, then removing the edge $uv$ from $T$ produces two
trees $T_1, T_2$ with $T_1$ containing $u$ and $T_2$ containing $v$. These two
trees are distinctly colored; let $S_1, S_2$ be their respective color sets,
and let $i_1, i_2$ be their respective sizes. Since $T_1$ is a
$(u, S_1)$-solution, $\pi_{S_1}$ is a multilinear monomial of $P_{i_1,u}$ by
induction hypothesis. Since $T_2$ is a $(v, S_2)$-solution, $\pi_{S_2}$ is a
multilinear monomial of $P_{i_2,v}$ by induction hypothesis. It follows that
$\pi_S = \pi_{S_1} \pi_{S_2}$ is a multilinear monomial of
$P_{i_1,u} P_{i_2,v}$, and thus of $P_{i,u}$.

Conversely, suppose that $\pi_S$ is a multilinear monomial of $P_{i,u}$. By
definition of $P_{i,u}$, there exist $1 \leq i' \leq i-1$ and
$v \in N_G(u)$ such that $\pi_S$ is a multilinear monomial of
$P_{i',u} P_{i-i',v}$. We can then partition $S$ into $S_1, S_2$, with
$\pi_{S_1}$ multilinear monomial of $P_{i',u}$ and $\pi_{S_2}$ multilinear
monomial of $P_{i-i',v}$. Induction hypothesis therefore implies that (i)
$|S_1| = i'$ and $|S_2| = i - i'$, (ii) there exists a $(u, S_1)$-solution
$T_1 = (V_1, E_1)$ and a $(v, S_2)$-solution $T_2 = (V_2, E_2)$. Since
$S_1, S_2$ are disjoint, it follows that $|S| = i$, which proves (i); besides,
$V_1, V_2$ are disjoint, and thus
$T = (V_1 \cup V_2, E_1 \cup E_2 \cup \{uv\})$ is a $(u, S)$-solution, which
proves (ii).

6.2 Proof of Lemma 1 

We first reduce #Multicolored Clique to #MEST-1. Our source problem
#Multicolored Clique is the counting version of Multicolored Clique, which is
easily seen to be #W[1]-hard. Let $I = (G, k)$ be an instance of the problem,
where $G = (V, E)$ has a partition $\mathcal{P}$ into classes
$V_1, \ldots, V_k$. Our target instance is $I' = (k, r, H, t)$ with
$r = k^2 - k + 1$ and $t = (k, k-1, \ldots, k-1)$. The graph $H$ is obtained
by splitting every edge $e$ in two parallel edges; then $H$ is a $k$-partite
multigraph with partition $\mathcal{P}$. Let $S_I, S_{I'}$ be the solution
sets of $I$ and $I'$ respectively. Let $\mathcal{K}_k$ be the multigraph with
$k$ vertices $1, \ldots, k$, and with two parallel edges between distinct
vertices; its partition is $\mathcal{P}_k$ consisting of the sets
$\{1\}, \ldots, \{k\}$. Let $\mathcal{U}_k$ denote the set of
$(\mathcal{P}_k, t)$-labeled edsts $(T, \psi_V, \psi_E)$ for $\mathcal{K}_k$
such that $T(\psi_V) = t$. Observe that $\mathcal{U}_k \neq \emptyset$: since
every vertex of $\mathcal{K}_k$ has degree $2(k-1)$, it follows that
$\mathcal{K}_k$ has an Eulerian path starting at 1, which visits $k$ times the
vertex 1, and each other vertex $k-1$ times. We claim that
$|S_{I'}| = |\mathcal{U}_k| |S_I|$, which will prove the correctness of the
reduction. To this aim, we will describe a bijection
$\Phi : S_I \times \mathcal{U}_k \to S_{I'}$.
Consider a pair $P = (C, \mathcal{T}) \in S_I \times \mathcal{U}_k$ with
$\mathcal{T} = (T, \psi_V, \psi_E)$ and $C = \{v_1, \ldots, v_k\}$
multicolored clique of $G$ (with $v_i \in V_i$). Let
$\phi = (\phi_V, \phi_E)$ be the homomorphism of $\mathcal{K}_k$ into $H$
which maps $i$ to $v_i$, and the parallel edges accordingly. We then define
$\mathcal{T}' = \Phi(P)$ by $\mathcal{T}' = (T, \psi'_V, \psi'_E)$, where (i)
$\psi'_V$ is defined so that if $\psi_V(u) = (v, i)$ and if $\phi_V(v) = w$
then $\psi'_V(u) = (w, i)$, (ii) $\psi'_E = \phi_E \circ \psi_E$. We verify
that $\mathcal{T}' \in S_{I'}$: indeed, it is a $(\mathcal{P}, t)$-labeled
edst of $H$ and $T(\psi'_V) = t$ (since we have composed with injective
functions $\phi_V, \phi_E$). To prove that $\Phi$ is a bijection, we define
the inverse function $\Psi : S_{I'} \to S_I \times \mathcal{U}_k$ as follows.
Consider $\mathcal{T}' = (T, \psi'_V, \psi'_E)$ a $(\mathcal{P}, t)$-labeled
edst of $H$, with $T(\psi'_V) = t$. This equality yields vertices
$v_1 \in V_1, \ldots, v_k \in V_k$ such that
$|(\psi'^{\,r}_V)^{-1}(v_i)| = t_i$. Let $C = \{v_1, \ldots, v_k\}$, then $C$
is a multicolored clique of $G$: indeed, $H[C]$ has at most $k^2 - k$ edges,
and since $\psi'_E$ is injective it must have exactly $k^2 - k$ edges,
implying that $G[C]$ is a complete graph. We can then define
$(\psi_V, \psi_E)$ from $(\psi'_V, \psi'_E)$ by "projecting" $v_i$ on $i$, and
the parallel edges accordingly (for instance, if $\psi'_V(u) = (v_i, j)$ then
$\psi_V(u) = (i, j)$). We finally define $P = \Psi(\mathcal{T}')$ by
$P = (C, \mathcal{T})$ where $\mathcal{T} = (T, \psi_V, \psi_E)$. It is easy
to see that $P \in S_I \times \mathcal{U}_k$, and that $\Phi$ and $\Psi$ are
inverse of each other.

We now give a Turing-reduction of #MEST-1 to #MEST-2. Given a tuple
$t \in [r]^k$, we define the instance $I_t = (k,r,G,t)$, and we let
$S_t, S'_t$ be its solution sets for #MEST-1, #MEST-2 respectively. Let
$N_t = |S_t|$ and $N'_t = |S'_t|$. We have for every $t \in [r]^k$:
$N'_t = \sum_{t' \leq t} N_{t'}$, which yields by Möbius inversion that for
every $t \in [r]^k$: $N_t = \sum_{t' \leq t} \mu(t,t') N'_{t'}$ (see
footnote 5). Therefore, we can compute a value $N_t$ using $O(2^k)$ oracle
calls for #MEST-2, thereby solving #MEST-1. □



5 where $\mu(t,t')$ is 0 if there exists $i \in [k]$ s.t. $t_i - t'_i > 1$,
and is otherwise equal to $(-1)^r$ where $r$ is the number of $i \in [k]$ s.t.
$t_i - t'_i = 1$.
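The Möbius inversion over tuples used in this Turing-reduction can be sketched as follows. The oracle for #MEST-2 is modeled by an arbitrary Python callable `n_prime`, and only the at most $2^k$ tuples $t'$ with $t_i - t'_i \in \{0, 1\}$ are queried, since $\mu(t,t')$ vanishes elsewhere. The names `mobius` and `invert`, and the use of a callable as oracle, are assumptions of this illustration.

```python
from itertools import product

def mobius(t, t2):
    """mu(t, t'): 0 if some coordinate differs by more than 1 (or t' is not <= t),
    otherwise (-1) to the number of coordinates where t_i - t'_i = 1."""
    diffs = [a - b for a, b in zip(t, t2)]
    if any(d < 0 or d > 1 for d in diffs):
        return 0
    return (-1) ** sum(diffs)

def invert(t, n_prime):
    """Recover N_t from the values N'_{t'} using at most 2^k oracle calls."""
    total = 0
    for delta in product((0, 1), repeat=len(t)):
        t2 = tuple(a - d for a, d in zip(t, delta))
        if min(t2) < 1:
            continue                      # tuples live in [r]^k, entries are >= 1
        total += mobius(t, t2) * n_prime(t2)
    return total

# Sanity check on an arbitrary function N and its prefix sums N'.
def N(t):
    return t[0] + 2 * t[1]

def N_prime(t):
    return sum(N(s) for s in product(*(range(1, a + 1) for a in t)))
```

Applying `invert` to `N_prime` recovers `N` on every tuple, which is the identity used in the reduction.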